An empirical study on the effectiveness of data resampling approaches for cross‐project software defect prediction
Authors
Abstract
Cross-project defect prediction (CPDP), where data from different software projects are used to predict defects, has been proposed as a way to provide data for software projects that lack historical data. Evaluations of CPDP models using the Nearest Neighbour (NN) Filter approach have shown promising results in recent studies. A key challenge with defect-prediction datasets is class imbalance, that is, highly skewed datasets where non-buggy modules dominate the buggy modules. In the past, data resampling approaches have been applied to within-project defect prediction models to help alleviate the negative effects of class imbalance in the datasets. To address the class imbalance issue in CPDP, the authors assess the impact of data resampling approaches on CPDP models after the NN Filter is applied. The impact on prediction performance of five oversampling approaches (MAHAKIL, SMOTE, Borderline-SMOTE, Random Oversampling and ADASYN) and three undersampling approaches (Random Undersampling, Tomek Links and One-sided selection) is investigated and compared with approaches without data resampling. The authors examined six defect prediction models on 34 datasets extracted from the PROMISE repository. The authors' results show that there is a significant positive effect of data resampling on CPDP performance, suggesting that software quality teams and researchers should consider applying data resampling approaches for improved recall (pd) and g-measure prediction performance. However, if the goal is to improve precision and reduce false alarm (pf), then data resampling approaches should be avoided.
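To make the terms in the abstract concrete, the following minimal sketch shows random oversampling (the simplest of the eight resampling approaches named above) and the four measures mentioned: recall (pd), false alarm (pf), precision and g-measure. It is an illustration under stated assumptions, not the authors' implementation: the function names `random_oversample` and `defect_metrics` are hypothetical, the toy labels are invented, and the g-measure is assumed to be the harmonic mean of pd and 1 - pf, as is common in the defect prediction literature.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until both classes are equal in size.

    Hypothetical helper for illustration; assumes binary labels.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    majority_size = max(counts.values())
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    # Sample with replacement from the minority class to close the gap.
    extra = [rng.choice(minority_idx) for _ in range(majority_size - counts[minority])]
    keep = list(range(len(labels))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]


def defect_metrics(actual, predicted):
    """Compute pd, pf, precision and g-measure, with 1 = buggy and 0 = non-buggy.

    Assumption: g-measure = 2 * pd * (1 - pf) / (pd + (1 - pf)).
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    recall_pd = tp / (tp + fn) if tp + fn else 0.0
    false_alarm_pf = fp / (fp + tn) if fp + tn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = recall_pd + (1 - false_alarm_pf)
    g_measure = 2 * recall_pd * (1 - false_alarm_pf) / denom if denom else 0.0
    return {"pd": recall_pd, "pf": false_alarm_pf,
            "precision": precision, "g_measure": g_measure}


# Toy training set: 8 non-buggy modules and 2 buggy modules (invented data).
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2
balanced_rows, balanced_labels = random_oversample(rows, labels)
balanced_counts = Counter(balanced_labels)

# Toy predictions on a separate test set (invented data).
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0]
metrics = defect_metrics(actual, predicted)
print(balanced_counts, metrics)
```

In this sketch only the training data would be resampled; the test set keeps its original class distribution. The metrics function makes the trade-off in the abstract visible: a model that flags more modules as buggy tends to raise pd, while any additional false positives raise pf and lower precision.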
Similar resources
A study on the effectiveness of textual modification on the improvement of Iranian upper-intermediate EFL learners’ reading comprehension
This study was conducted to investigate the effect of textual modification on improving the reading comprehension ability of Iranian upper-intermediate language learners. To this end, 115 male and female students majoring in English translation took part in the study.
A study on the effectiveness of task types (noticing-reformulation) on Iranian low-intermediate EFL learners’ retention of collocations
Abstract: The present quasi-experimental study examines the use of classroom exercises that raise awareness of, and conscious attention to, collocations as part of a foreign-language conversation course at an English language institute in Iran.
A study on insurer solvency by panel data model: the case of the Iranian insurance market
The aim of this thesis is to develop an approach for assessing the solvency of Iranian insurance companies. We use economic data with both time-series and cross-sectional variation, and therefore examine insurer solvency using a panel data model.
An empirical study on software defect prediction with a simplified metric set
Context: Software defect prediction plays a crucial role in estimating the most defect-prone components of software, and a large number of studies have pursued improving prediction accuracy within a project or across projects. However, the rules for making an appropriate decision between within- and cross-project defect prediction when available historical data are insufficient remain unclear. Ob...
Study of cohesive devices in the textbook of English for the students of psychology by Rastegarpour
This study investigates the cohesive devices used in the textbook of English for the students of psychology. The research questions and hypotheses concern the frequency and distribution of grammatical and lexical cohesive devices. To answer the questions, all grammatical and lexical cohesive devices in reading comprehension passages from 6 units of 21 units th...
Journal
Journal title: IET Software
Year: 2021
ISSN: 1751-8806, 1751-8814
DOI: https://doi.org/10.1049/sfw2.12052